Cancer Discovery

American Association for Cancer Research (AACR)

Preprints posted in the last 7 days, ranked by how well they match Cancer Discovery's content profile, based on 61 papers previously published here. The average preprint has a 0.07% match score for this journal, so anything above that is already an above-average fit.

1
Drug response profiling guides precision therapy in relapsed and refractory childhood acute lymphoblastic leukemia

Steffen, F. D.; Lissat, A.; Alten, J.; Kriston, A.; Scheidegger, N.; Eckert, C.; Bodmer, N.; Schori, L.; Schühle, S.; Arpagaus, A.; Gutnik, S.; Manioti, D.; Bruderer, N.; Zeckanovic, A.; Västrik, I.; Nyiri, G.; Kovacs, F.; Thorhauge Als-Nielsen, B. E.; Attarbaschi, A.; Rademacher, A.; Elitzur, S.; Jacoby, E.; De Moerloose, B.; Svenberg, P.; Ancliff, P.; Sramkova, L.; Buldini, B.; Balduzzi, A.; Boer, J. M.; Mielcarek, M.; Ceppi, F.; Ansari, M.; Halter, J.; Schmiegelow, K.; Locatelli, F.; DelBufalo, F.; Stanulla, M.; Kulozik, A. E.; Schrappe, M.; Rohrlich, P.; Cave, H.; Baruchel, A.; von Stack

2026-04-11 oncology 10.64898/2026.04.08.26350164 medRxiv
Top 0.3%
6.1%

Children with relapsed or refractory acute lymphoblastic leukemia (ALL) require more effective and less toxic therapies. We established a prospective, multicenter Drug Response Profiling (DRP) registry (NCT06550102) integrating functional testing into precision-guided treatment. DRP was performed for 340 patients from 17 European countries with a turn-around time of two weeks. Image-based drug screening with over 135,000 unique perturbations revealed a heterogeneous landscape of ex vivo responses to 88 drugs on average. Ranking drug responses across the patient cohort defined individual drug fingerprints, identifying "DRP twins" by similarity in sensitivity and resistance independent of genetic ALL subtypes. Of 239 high-risk patients with follow-up, DRP-informed interventions were reported for 63 patients (26%). Patients received combination therapies based on venetoclax, tyrosine kinase inhibitors, trametinib, bortezomib or selinexor, resulting in objective clinical responses in 43 cases (68%). Precision-guided treatments allowed bridging to cellular therapies in 42 patients among whom 28 (67%) were still alive with a median follow-up of 21 months after DRP (IQR: 14.7-26.6 months). Top responders to venetoclax, ranked within the first tertile of the cohort, had superior 1-year event-free survival compared to venetoclax non-responders (0.57 [95% CI, 0.39-0.85] vs. 0.25 [95% CI, 0.11-0.58]). Collectively, these findings demonstrate the feasibility and clinical relevance of functional profiling within an international network. This scalable framework enables individualized therapy selection for enrolment in adaptive precision trials for high-risk pediatric ALL.

2
Prospective Population-Scale Validation of an Electronic Health Record Based Model for Pancreatic Cancer Risk

Lahtinen, E.; Schigiltchoff, N.; Jia, K.; Kundrot, S.; Palchuk, M. B.; Warnick, J.; Chan, L.; Shigiltchoff, N.; Sawhney, M. S.; Rinard, M.; Appelbaum, L.

2026-04-13 oncology 10.64898/2026.04.11.26350318 medRxiv
Top 0.3%
4.9%

Background and aims: Pancreatic ductal adenocarcinoma (PDAC) surveillance is limited to individuals with familial or genetic risk although most future cases arise outside these groups. In a retrospective study, PRISM, an electronic health record (EHR)-based PDAC risk model, identified individuals in the general population at elevated near-term risk of PDAC. We aimed to prospectively evaluate whether PRISM can identify high-risk individuals beyond current surveillance groups across U.S. health systems. Methods: We performed a prospective multicenter cohort study after deployment of PRISM in April 2023 across 44 U.S. health care organizations. Eligible adults aged ≥40 years without prior PDAC received a single baseline risk score and were assigned to prespecified risk tiers. Patients were followed for incident PDAC for 30 months. We estimated tier-specific 30-month cumulative incidence (positive predictive value, PPV), number needed to screen (NNS), standardized incidence ratios (SIRs), and time from deployment and first high-risk flag to diagnosis. Results: Among 6,282,123 adults assigned a PRISM score, 5,058,067 had follow-up; 3,609 developed PDAC. The highest-risk tier had 30-fold higher PDAC incidence than the study population. At the SIR 5 threshold, 30-month cumulative incidence was 0.35% (NNS, 284.2); at SIR 16, 1.14% (NNS, 87.4); and at SIR 30, 2.19% (NNS, 45.7). Median time from deployment to PDAC diagnosis was 9.5 months, and median time from first high-risk flag to diagnosis at SIR 5 was 3.5 years. Shapley additive explanations (SHAP) analyses supported patient- and tier-level interpretability. Conclusions: Prospective deployment of PRISM across multiple U.S. health care organizations identified individuals at elevated near-term risk for PDAC, with substantial risk enrichment and lead time before diagnosis. These findings support the real-world scalability and generalizability of EHR-based risk stratification for risk-adapted early detection.
ClinicalTrials.gov identifier NCT05973331
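
The relationship between the cumulative incidence (PPV) and the number needed to screen quoted in this abstract can be checked with a short calculation. The Python sketch below assumes the conventional definition NNS = 1/PPV, which the abstract does not state explicitly; the function name is illustrative only. Small differences from the reported NNS values are expected, presumably because the published percentages are rounded.

```python
def number_needed_to_screen(cumulative_incidence_pct: float) -> float:
    """Approximate NNS as the reciprocal of cumulative incidence (assumed NNS = 1/PPV)."""
    if cumulative_incidence_pct <= 0:
        raise ValueError("cumulative incidence must be positive")
    return 100.0 / cumulative_incidence_pct


# Rounded 30-month cumulative incidences and NNS values quoted in the abstract,
# keyed by SIR threshold.
reported = {5: (0.35, 284.2), 16: (1.14, 87.4), 30: (2.19, 45.7)}

for sir, (ppv_pct, reported_nns) in reported.items():
    approx = number_needed_to_screen(ppv_pct)
    print(f"SIR {sir}: 1/PPV = {approx:.1f} (reported NNS {reported_nns})")
```

The reciprocal of each rounded percentage lands within about two units of the corresponding reported NNS, which is consistent with the assumed definition.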

3
Single-molecule cfDNA sequencing establishes clinical utility for ecDNA monitoring and multimodal liquid biopsy analysis

Sauer, C. M.; Tovey, N.; Ptasinska, A.; Hughes, D.; Stockton, J.; Zumalave, S.; Rust, A. G.; Lynn, C.; Livellara, V.; Sevrin, F.; Himsworth, C.; Muyas, F.; Nicolaidou, M.; Parry, G.; Paisana, E.; Cascao, R.; Ahmed, S. W.; Yasin, S. A.; Portela, L. R.; Balasubramanian, P.; Burke, G. A. A.; Vedi, A.; Faria, C. C.; Marshall, L. V.; Jacques, T. S.; Hubank, M.; Hargrave, D.; George, S.; Angelini, P.; Anderson, J.; Chesler, L.; Beggs, A. D.; Cortes-Ciriano, I.

2026-04-12 oncology 10.64898/2026.04.08.26350410 medRxiv
Top 0.4%
4.2%

Cell-free DNA (cfDNA) profiling enables minimally invasive cancer detection and monitoring. We present SIMMA, a low-input single-molecule sequencing approach that enables multimodal whole-genome and high-depth targeted sequencing of the same cfDNA sample for both tumour-agnostic and tumour-informed liquid biopsy analysis. Across 792 plasma and cerebrospinal fluid cfDNA samples from 277 paediatric patients with diverse brain and extracranial tumours, SIMMA enabled tumour diagnosis, detection of driver mutations, and reconstruction of extrachromosomal DNA (ecDNA) months before clinical relapse. Using conformal prediction trained on genome-wide fragmentomics, genomic and epigenomic data, SIMMA predicts disease burden as a continuous variable and provides well-calibrated uncertainty estimates for each sample, achieving a limit of detection of ~100 ppm from low-pass whole-genome sequencing data. In summary, SIMMA establishes the clinical utility of multimodal cfDNA profiling with uncertainty quantification for individual patients and unlocks the potential of ecDNA as a liquid biopsy biomarker for disease detection and monitoring across diverse aggressive malignancies.

4
Shared inheritance reveals landscape of somatic and germline cancer risk in TP53

MacGregor, H. A. J.; Blundell, J. R.; Easton, D. F.

2026-04-11 genetic and genomic medicine 10.64898/2026.04.10.26350605 medRxiv
Top 0.9%
2.0%

Pathogenic variants in TP53, the key tumour-suppressor gene underlying Li-Fraumeni syndrome (LFS), are among the best-established causes of inherited cancer predisposition. However, large-scale sequencing has revealed that many apparently pathogenic TP53 variants detected in blood are the result of somatic clonal expansions, complicating risk interpretation. Using blood-derived whole-exome data from 469,391 UK Biobank participants, we combined variant allele fraction (VAF) with haplotype-sharing analysis to distinguish germline and somatic TP53 variants. Germline variants were concentrated at sites linked to partial loss of p53 function and lower disease penetrance, whereas classic LFS alleles appeared almost entirely somatic. High-VAF carriers of classic LFS alleles conferred markedly increased risk of haematological malignancy but not solid tumours, consistent with large TP53-mutant clonal expansions. The prevalence of somatic clonal expansion also correlated with missense variant pathogenicity, suggesting that somatic activity provides an informative in vivo proxy for functional impact. These results provide new insights into TP53-associated cancer risk at the population level, demonstrate that somatic rather than germline risk predominates in middle-aged healthy adults and provide a scalable framework for variant classification in large-scale population genomics.

5
Functional PD-1/PD-L1 engagement defines a spatial biomarker of immunotherapy response

Ullman, T.; Krantz, D.; Avenel, C.; Lung, M.; Svedman, F. C.; Holmsten, K.; Ostling, P.; Ullen, A.; Stadler, C.

2026-04-17 oncology 10.64898/2026.04.15.26350929 medRxiv
Top 1%
1.7%

Effective predictive biomarkers for immune checkpoint inhibitor (ICI) therapy remain an unmet need across solid tumors. Here, we present an integrated spatial proteomics workflow that combines in situ proximity ligation assay with multiplexed immunofluorescence to directly resolve PD-1/PD-L1 signaling events at the level of defined cellular phenotypes and their spatial organization within intact tumor tissue. Applied as a proof of concept to tumor samples from patients with metastatic urothelial carcinoma treated with pembrolizumab, this approach reveals that PD-1/PD-L1 interactions specifically involving cytotoxic CD8+CD3+ T cells are significantly enriched in complete responders, while such interactions are rare in patients with progressive disease. This interaction-defined T cell subset achieves superior discrimination of clinical response compared to single-marker PD-L1 expression or immune cell abundance alone. By integrating direct detection of protein-protein interactions with high-dimensional single-cell phenotyping, our workflow provides a mechanistically informed, spatially resolved biomarker of functional immune engagement. Beyond urothelial carcinoma, this platform establishes a generalizable framework for translating spatial signaling biology into predictive tools for immunotherapy response across tumor types.

6
NMF Deconvolution of a High-ROS Transcriptional Program Uncovers mTOR-Dependent Therapeutic Sensitivity in Stomach Adenocarcinoma

Roy, R.; Patnaik, J.; Chakraborty, A.; Patnaik, S.; Parija, T.

2026-04-16 oncology 10.64898/2026.04.12.26350699 medRxiv
Top 1%
1.7%

Background: Stomach adenocarcinoma is driven by heterogeneity, limiting therapeutic success. Although ROS acts as a continuous redox rheostat for tumor evolution, it is categorized based on binary models that are masked by tumor-microenvironment (TME) confounders. Here, we have defined a continuous, TME-independent ROS axis to help identify intrinsic vulnerabilities and improve patient stratification. Methods: Non-negative matrix factorization (NMF) defined a ROS axis in TCGA-STAD, which was validated in the ACRG cohort. A multivariate regression model isolated intrinsic signatures via residual ROS scores by adjusting for TME confounders. Survival was assessed using Cox hazard models. Drug sensitivities were mapped using GDSC2/ElasticNet modeling with cross-cohort replication. Results: Our results define a reproducible ROS gradient, driven by effectors like NQO1 and SOD1, characterizing ROS-high tumors as proliferative, epithelial and immune-cold. A high residual ROS score was associated with an improved prognosis, regardless of TNM stage and age. Pharmacogenomic mapping revealed an overlapping sensitivity to mTOR inhibitors in ROS-high gastric cancer tumors which persisted after TME confounder adjustment. Conclusion: The continuous ROS axis provides a functional readout of metabolic dependency that refines traditional anatomical staging. By identifying mTOR-dependent cold tumors, our framework offers a precision strategy for immunotherapy-resistant patients like those affected by microsatellite-stable gastric cancer.

7
Five-Domain Accelerometer-Derived Behavioral Exposome and Incident Cancer Risk in UK Biobank

Ni Chan Chin (Chengqin Ni), M.; Berrio, J. A.

2026-04-12 epidemiology 10.64898/2026.04.07.26350369 medRxiv
Top 1%
1.7%

Background: Accelerometer-derived behavioral phenotype captures multidimensional aspects of human behavior extending well beyond physical activity, encompassing light exposure, step counts, physical activity patterns, sleep, and circadian rhythms. Whether these five domains constitute a unified behavioral architecture underlying cancer risk and whether circadian organization and light exposure confer incremental predictive value beyond movement volume alone remains to be comprehensively established. Methods: We conducted an accelerometer-wide association study (AWAS) encompassing the complete accelerometer-derived behavioral exposome across five behavioral domains in UK Biobank participants with valid wrist accelerometry data. Incident solid cancers were designated as the primary endpoint, with prespecified site-specific solid cancers and hematological malignancy as secondary outcomes. Cox proportional hazards models with age as the timescale were used. The minimal covariate set served as the primary reporting tier, followed by sensitivity analyses additionally adjusting for adiposity/metabolic factors, independent activity patterns, shift work history, and accelerometry measurement quality. Nominal statistical significance was defined as two-sided P < 0.05. Results: Among 89,080 participants, 6,598 incident solid cancer events were observed over a median follow-up of 8.39 years. In the minimally adjusted model, the pan-solid-tumor association atlas was dominated by signals from activity volume, inactivity fragmentation, and circadian rhythm. Higher overall acceleration (HR per SD: 0.91, 95% CI: 0.89-0.94) and higher daily step counts (HR: 0.93, 95% CI: 0.90-0.95) were independently associated with reduced solid cancer risk, while inactivity fragmentation metrics were consistently linked to higher risk.
Notably, circadian rhythms, most prominently cosinor mesor (Midline Estimating Statistic of Rhythm under cosinor model), emerged as leading inverse risk signals, underscoring the independent contribution of circadian behavioral architecture. Site-specific analyses revealed pronounced heterogeneity across tumor sites. Lung cancer exhibited a robust inverse activity-risk gradient, while breast cancer showed reproducible associations with MVPA. Most strikingly, nocturnal light exposure demonstrated a tumor-site-specific association confined to pancreatic cancer, a signal absent across all other sites examined. Associations for uterine cancer were predominantly inactivity-related and substantially attenuated following adjustment for adiposity and metabolic factors. Conclusions: Across five accelerometer-derived behavioral domains, solid cancers as a whole were most consistently associated with a high-movement, low-fragmentation, and circadian-coherent behavioral profile. While site-specific heterogeneity exists, the broad cancer risk landscape is dominated by movement volume, inactivity fragmentation, and circadian rhythmicity. Light exposure, although more localized in its contribution, demonstrates a potentially novel and specific association with pancreatic cancer risk. These findings support a five-domain behavioral exposome framework for cancer epidemiology and, importantly, position circadian rhythm integrity and nocturnal light exposure as critically understudied dimensions warranting dedicated mechanistic investigation.
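
For readers unfamiliar with the cosinor mesor mentioned in this abstract, the standard single-component cosinor model is y(t) = M + A·cos(2πt/τ + φ), where M is the mesor, A the amplitude, and φ the acrophase. The Python sketch below fits that model by ordinary least squares on synthetic hourly data. It is not the authors' pipeline: the 24-hour period, the function name `fit_cosinor`, and the synthetic signal are all assumptions made for illustration.

```python
import math


def fit_cosinor(times_h, values, period_h=24.0):
    """Least-squares fit of y = M + b*cos(wt) + g*sin(wt).

    Returns (mesor, amplitude, acrophase) for y = M + A*cos(wt + phi).
    """
    w = 2.0 * math.pi / period_h
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times_h]
    # Normal equations (X^T X) beta = X^T y for the three linear parameters.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    m = [ata[i] + [aty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(3):
            if k != col:
                f = m[k][col] / m[col][col]
                m[k] = [a - f * b for a, b in zip(m[k], m[col])]
    mesor, b, g = (m[i][3] / m[i][i] for i in range(3))
    amplitude = math.hypot(b, g)
    acrophase = math.atan2(-g, b)  # since b = A*cos(phi) and g = -A*sin(phi)
    return mesor, amplitude, acrophase


# Synthetic two-day hourly signal (hypothetical activity units).
hours = [float(h) for h in range(48)]
signal = [30.0 + 10.0 * math.cos(2.0 * math.pi * h / 24.0 - 1.0) for h in hours]
mesor, amplitude, acrophase = fit_cosinor(hours, signal)
print(f"mesor={mesor:.2f} amplitude={amplitude:.2f} acrophase={acrophase:.2f} rad")
```

The mesor is the rhythm-adjusted mean level of the signal, which is why it can track overall activity volume as well as rhythm structure.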

8
Heterogeneous, Population-Level Drug-Tolerant Persisters Exhibit Ion-Channel Remodeling and Ferroptosis Susceptibility

Hayford, C. E.; Baleami, B.; Stauffer, P. E.; Paudel, B. B.; Al'Khafaji, A.; Brock, A.; Quaranta, V.; Tyson, D. R.; Harris, L. A.

2026-04-13 systems biology 10.1101/2022.02.03.479045 medRxiv
Top 1%
1.7%

Drug-tolerant persisters (DTPs) represent a major obstacle to durable responses in targeted cancer therapy. DTPs are commonly described as distinct single-cell states that survive drug treatment via reversible, non-genetic mechanisms and drive tumor recurrence. Recent work demonstrates that multiple DTPs can coexist, reflecting diversity in lineage, signaling programs, or stress responses. However, each DTP is still generally viewed as a uniform cellular phenotype. Building on our prior work describing a population-level DTP termed "idling" [Paudel et al., Biophys. J. (2018) 114, 1499-1511], here we present evidence supporting a fundamentally different view: that DTPs are not single-cell states, but rather heterogeneous populations composed of multiple sub-states with distinct division and death rates that balance to produce near-zero net population growth. Using single-cell transcriptomics and lineage barcoding, we identify multiple phenotypic states within idling DTP populations, with reduced heterogeneity compared to untreated populations, and find that idling DTP cells emerge from nearly all lineages. Transcriptomic and functional analyses further reveal altered ion-channel activity in idling DTPs, which we confirm experimentally. Moreover, drug-response assays reveal increased susceptibility of idling DTPs to ferroptosis, a non-apoptotic form of regulated cell death, indicating the emergence of vulnerabilities associated with drug tolerance. Altogether, our results support a population-level view of tumor drug tolerance in which DTPs comprise stable collections of phenotypic states, shaped by treatment-defined phenotypic landscapes, which are potentially vulnerable to subsequent interventions. 
This perspective implies that eradicating DTPs will require a fundamental shift away from cell-type-centric strategies toward sequential treatments that progressively reduce phenotypic heterogeneity by modulating the molecular and cellular processes that establish the DTP landscape, an approach previously termed "targeted landscaping."

9
Vaccine-induced antibody and T cell responses in children with acute lymphoblastic leukemia

Shapiro, J. R.; Dorogy, A.; Science, M.; Gupta, S.; Alexander, S.; Bolotin, S.; Watts, T. H.

2026-04-12 oncology 10.64898/2026.04.10.26350531 medRxiv
Top 1%
1.4%

Children with acute lymphoblastic leukemia (ALL) are treated with multiagent chemotherapy that causes profound changes to the immune system. There are limited data on how disease and therapy impact antigen-specific immune memory, leading to inconsistent guidelines on best practices for revaccination of this population. Here, to inform vaccine guidance, we investigated whether immunity derived from routine childhood measles and varicella zoster virus (VZV) vaccines is maintained during and after therapy for childhood ALL. We report that antibodies against measles and VZV were significantly reduced in children with ALL (n=45) compared to healthy controls (n=13), particularly in older children in whom a longer time had passed since their most recent vaccine dose. However, the avidity of the measles and VZV-specific antibodies was indistinguishable between groups. Despite changes to the composition of the T cell compartment, both overall and antigen-specific T cell function were preserved in children with ALL. These data provide compelling evidence for revaccination of children following ALL treatment. Intact T cell responses suggest that post-treatment revaccination would be effective.

10
Mutation timing, accumulation and selection in the male germline shape inheritance risk for developmental disorders

Neville, M. D. C.; Neuser, S.; Sanghvi, R.; Christopher, J.; Roberts, K.; Smith, K.; ONeill, L.; Hayes, J.; Cagan, A.; Hurles, M. E.; Goriely, A.; Abou Jamra, R.; Rahbari, R.

2026-04-13 genetic and genomic medicine 10.64898/2026.04.09.26350474 medRxiv
Top 1%
1.2%

De novo mutations (DNMs) arising in the parental germline are a major cause of severe developmental disorders. While most DNMs originate in the paternal germline, it remains unclear whether fathers of affected children carry a systematically altered burden of transmissible germline risk, or whether disease largely reflects stochastic outcomes of shared population-wide mutational processes. Here, we combined whole-genome sequencing of 168 parent-child trios with ultra-accurate duplex sequencing of paternal sperm to directly relate transmitted DNMs to the broader mutational and selective landscape of the male germline. In 127 fathers, sperm mutation burden and mutational spectra were indistinguishable from population reference cohorts. Positive selection metrics were likewise concordant, with a global dN/dS of 1.56 (95% CI 1.45-1.67) compared to 1.44 (95% CI 1.17-1.77) in controls and 28 of 32 significantly selected genes overlapping with prior findings. Six fathers harboured a pathogenic early mosaic variant detectable in sperm at allele fractions that ranged from 0.7% to 14.8%. Although these variants generated substantial individual-level risk outliers, they accounted for only ~11% of the aggregated exome pathogenic burden across the cohort. The remaining burden was distributed across low-VAF mutations, including positively selected driver variants and other rare mutations accumulating with paternal age. Together, these results show that transmissible de novo disease risk is governed primarily by universal germline mutational and selective processes, while early developmental mosaicism produces uncommon but clinically meaningful deviations. This integrated view clarifies how mutation timing, age-associated accumulation and germline selection jointly shape inheritance risk.

11
Characterization of a pancreatic cancer GWAS signal suggests PDX1 buffers stress in the exocrine pancreas

Hoskins, J. W.; Christensen, T. A.; Eiser, D.; Char, E.; Mobaraki, M.; O'Brien, A.; Collins, I.; Zhong, J.; Patel, M. B.; Prasad, G.; Pancreatic Cancer Cohort Consortium and Pancreatic Cancer Case-Control Consortium (PanScan/PanC4); Arda, E.; Connelly, K. E.; Amundadottir, L. T.

2026-04-15 genetic and genomic medicine 10.64898/2026.04.13.26350790 medRxiv
Top 2%
0.9%

Pancreatic ductal adenocarcinoma (PDAC) remains one of the deadliest human cancers. The current largest published PDAC Genome-Wide Association Study (GWAS) identified 23 genetic risk signals, but most lack sufficient characterization. This study aimed to functionally characterize the chr13q12.2 (PLUT/PDX1) PDAC GWAS risk locus. Fine-mapping, luciferase reporter assays, and electrophoretic mobility shift assays implicated rs9581943, a PDX1 promoter SNP, as a functional variant underlying this GWAS signal. GTEx expression QTL analyses identified rs9581943 as a significant PDX1 eQTL in pancreas, and CRISPR/Cas9 editing in PDAC-derived cell lines confirmed a functional relationship. PDX1 is a transcription factor involved in early pancreas development and β-cell homeostasis, but its role in exocrine pancreatic cells is unclear. Single-nucleus RNA-seq analyses of pancreatic acinar and ductal cells from neonatal, adult, and chronic pancreatitis donors suggested PDX1 activity alleviates high secretory load and ER stress in acinar cells and biases ducts toward homeostatic phenotypes. Similarly, scRNA-seq analyses of pancreatic tumors suggested PDX1 activity reduces biosynthetic and inflammatory stress and promotes epithelial differentiation. Our study therefore implicates rs9581943 as a causal variant for the chr13q12.2 PDAC GWAS signal wherein the risk allele reduces PDX1 expression, eroding PDX1's capacity to buffer stress and stabilize epithelial cell fate in the exocrine compartment.

12
Why Invariant Risk Minimization Fails on Tabular Data: A Gradient Variance Solution

Mboya, G. O.

2026-04-13 epidemiology 10.64898/2026.04.09.26350513 medRxiv
Top 2%
0.9%

Machine learning models trained on observational data from one environment frequently fail when deployed in another, because standard learning algorithms exploit spurious correlations alongside causal ones. Invariant learning methods address this problem by seeking representations that support stable prediction across training environments, but their behavior on tabular data remains poorly characterized. We present CausTab, a gradient variance regularization framework for causal invariant representation learning on mixed tabular data. CausTab penalizes the variance of parameter gradients across training environments, providing a richer invariance signal than the scalar penalty used by Invariant Risk Minimization (IRM). We provide formal results showing that the gradient variance penalty is zero at causally invariant solutions and positive at solutions that rely on spurious features. Through experiments on synthetic data across three spurious-correlation regimes, four cycles of the National Health and Nutrition Examination Survey (NHANES), and four hospital systems in the UCI Heart Disease dataset, we demonstrate that: (1) IRM consistently degrades relative to standard empirical risk minimization (ERM) on tabular data, losing up to 13.8 AUC points in spurious-dominant settings, a failure we trace mechanistically to penalty collapse during training; (2) CausTab matches or exceeds ERM in every experimental condition; (3) CausTab achieves consistently better probability calibration than both ERM and IRM; and (4) invariant learning methods fail when environments differ in outcome prevalence rather than in spurious feature correlations, a boundary condition we characterize both empirically and theoretically. We introduce the Spurious Dominance Index (SDI), a practical scalar diagnostic for determining whether a dataset requires invariant learning, and validate it across all experimental settings.
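
To make the idea of a gradient variance penalty concrete, the Python sketch below computes, for a logistic model, the per-environment mean gradient and then sums the across-environment variance of each gradient coordinate. This is only one plausible reading of "variance of parameter gradients across training environments": the abstract does not give CausTab's exact formulation, and the function names and toy environments here are hypothetical.

```python
import math


def logistic_grad(weights, X, y):
    """Mean gradient of the logistic loss with respect to the weights in one environment."""
    grad = [0.0] * len(weights)
    for xi, yi in zip(X, y):
        z = sum(w * x for w, x in zip(weights, xi))
        p = 1.0 / (1.0 + math.exp(-z))
        for j, x in enumerate(xi):
            grad[j] += (p - yi) * x / len(X)
    return grad


def gradient_variance_penalty(weights, environments):
    """Sum over parameters of the variance of per-environment gradients (assumed form)."""
    grads = [logistic_grad(weights, X, y) for X, y in environments]
    n_env = len(grads)
    penalty = 0.0
    for j in range(len(weights)):
        mean_j = sum(g[j] for g in grads) / n_env
        penalty += sum((g[j] - mean_j) ** 2 for g in grads) / n_env
    return penalty


# Toy environments: the first feature relates to the label the same way in both,
# while the second feature flips sign between environments.
env_a = ([[1.0, 1.0], [-1.0, -1.0]], [1, 0])
env_b = ([[1.0, -1.0], [-1.0, 1.0]], [1, 0])
w = [1.0, 0.0]

print("identical environments:", gradient_variance_penalty(w, [env_a, env_a]))
print("shifted environments:  ", gradient_variance_penalty(w, [env_a, env_b]))
```

In this toy setup the per-environment gradients agree on the first coordinate and disagree on the second, so the penalty is positive only when the two environments differ. In training, a term of this kind would be added to the pooled loss with a weighting coefficient.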

13
Virtual Spectral Decomposition with Dendritic Binary Gating Detects Pancreatic Cancer Tissue Transformation on Standard CT: Multi-Institutional Validation Across Three Independent Datasets with a 3.8-Year Pre-Diagnostic Detection Window

Chandra, S.

2026-04-12 oncology 10.64898/2026.04.08.26350418 medRxiv
Top 2%
0.8%

Background. Pancreatic ductal adenocarcinoma (PDAC) has a five-year survival rate of approximately 12%, largely because it is typically diagnosed at an advanced stage. CT-based computational methods for early detection exist but rely on black-box deep learning or large texture feature sets without tissue-specific interpretability. Methods. We developed Virtual Spectral Decomposition (VSD), which applies six parameterized sigmoid functions S(HU) = 1/(1+exp(-alpha x (HU - mu))) to standard portal-venous CT, decomposing each pixel into tissue-specific response channels for fat (mu=-60), fluid (mu=10), parenchyma (mu=45), stroma (mu=75), vascular (mu=130), and calcification (mu=250). Dendritic Binary Gating identifies structural content per channel using morphological filtering, enabling co-firing analysis and lone firer identification. A 25-feature signature was extracted per patient. Three independent datasets were analyzed: NIH Pancreas-CT (n=78 healthy), Medical Segmentation Decathlon Task07 (n=281 PDAC, paired tumor/adjacent tissue), and CPTAC-PDA from The Cancer Imaging Archive (n=82, multi-institutional, with DICOM time point tags). The same six sigmoid parameters were used across all datasets without retraining. Results. VSD achieved AUC 0.943 for field effect detection (healthy vs cancer-adjacent parenchyma) and AUC 0.931 for patient-stratified tumor specification on MSD. On CPTAC-PDA, VSD achieved AUC 0.961 (6 features) and 0.979 (25 features) for distinguishing healthy from cancer-bearing pancreas on scans obtained prior to pathological diagnosis. All significant features replicated across datasets in the same direction: z_fat (d=-2.10, p=3.5e-27), z_fluid (d=-2.76, p=2.4e-38), fire_fat (d=+2.18, p=1.2e-28). Critically, VSD severity did not correlate with days-from-diagnosis (r=-0.008, p=0.944) across a range of day -1394 to day +249. 
Patient C3N-01375, scanned 3.8 years before pathological diagnosis, had VSD severity 1.87, well above the healthy mean of 0.94 +/- 0.33. The tissue transformation signature was temporally stable, indicating an early, persistent tissue state rather than a progressively worsening process. Conclusions. VSD with Dendritic Binary Gating detects a stable pancreatic tissue composition signature on standard CT that is present years before clinical diagnosis, validated across three independent datasets without parameter adjustment. The six sigmoid channels map to biologically meaningful tissue components through a fully transparent interpretability chain. The temporal stability of the signal implies a detection window of 3-7 years, consistent with known PanIN-3 microenvironment transformation timelines. VSD functions as a single-scan screening tool applicable to any abdominal CT performed during the pre-clinical window.
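
The sigmoid channel decomposition described in this abstract, S(HU) = 1/(1+exp(-alpha x (HU - mu))), can be illustrated with a few lines of Python. The channel centres (mu) are taken from the abstract; the slope `alpha` is not reported there, so the value 0.1 used below is a placeholder assumption, and the function name is illustrative. The sketch covers only the per-pixel sigmoid step, not Dendritic Binary Gating or feature extraction.

```python
import math

# Channel centres (mu, in Hounsfield units) as listed in the abstract.
CHANNEL_MU = {
    "fat": -60.0,
    "fluid": 10.0,
    "parenchyma": 45.0,
    "stroma": 75.0,
    "vascular": 130.0,
    "calcification": 250.0,
}


def channel_responses(hu: float, alpha: float = 0.1) -> dict:
    """Evaluate S(HU) = 1 / (1 + exp(-alpha * (HU - mu))) for every channel.

    alpha is a placeholder slope; the abstract does not report its value.
    """
    return {
        name: 1.0 / (1.0 + math.exp(-alpha * (hu - mu)))
        for name, mu in CHANNEL_MU.items()
    }


# Responses for a single pixel at 45 HU (the parenchyma channel centre).
for name, value in channel_responses(45.0).items():
    print(f"{name:>13}: {value:.3f}")
```

Each channel's response is exactly 0.5 at its own centre and rises monotonically with HU, so a pixel at 45 HU responds above 0.5 in the fat and fluid channels and below 0.5 in the stroma, vascular, and calcification channels.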

14
Inherited genetic risk factors in young-onset lung cancer

Esai Selvan, M.; Gould Rothberg, B. E.; Patel, A. A.; Sang, J.; Horowitz, A.; Christiani, D. C.; Klein, R. J.; Gumus, Z. H.

2026-04-15 genetic and genomic medicine 10.64898/2026.04.14.26350822 medRxiv
Top 2%
0.7%

Introduction: Lung cancer is rare before age 45, and its inherited genetic basis remains poorly defined. Methods: We performed whole-genome sequencing in 171 predominantly young-onset lung cancer patients and integrated these data with whole-exome sequencing from six major lung cancer consortia, yielding 9,065 patients. After quality control, analyses focused on 6,545 individuals of European ancestry, the largest ancestral group. We compared the prevalence of rare pathogenic and likely pathogenic (P/LP) germline variants between 186 young-onset (age <45 years) and 6,359 older patients at gene and gene-set levels using Fisher's exact test, stratified by histology, sex, and smoking status. Polygenic risk scores (PRS) derived from common variants were also evaluated. Results: Young-onset patients carried a higher burden of rare germline P/LP variants in DNA damage response (DDR) genes (including BRIP1, ERCC6, MSH5), and in cilia-related genes, notably GPR161. At the pathway level, DDR genes were significantly enriched (OR=1.66, p=0.007), with the strongest signal in the Fanconi Anemia pathway and among females (OR=1.96, p=0.01). Enrichment was also observed in inborn errors of immunity pathways, with strongest signals in antibody deficiency and the complement system genes. Young-onset patients additionally exhibited higher lung cancer PRS. Conclusion: Young-onset lung cancer exhibits a distinct germline genetic architecture, characterized by enrichment of rare P/LP variants in DDR, cilia-related, and immune pathways, and an elevated lung cancer PRS. These findings support a greater role for inherited susceptibility in early-onset disease and have implications for risk stratification, earlier screening, and precision prevention.

15
OCA-B/Pou2af1 Expression in T Cells Promotes PD-1 Blockade-Induced Autoimmunity but is Dispensable for Anti-Tumor Immunity

Du, J.; Manna, A. K.; Medina-Serpas, M. A.; Hughes, E. P.; Bisoma, P.; Evason, K. J.; Young, A.; Wilson, W. D.; Brusko, T.; Farahat, A. A.; Tantin, D.

2026-04-16 immunology 10.1101/2025.10.22.683978 medRxiv
Top 2%
0.7%

The transcription coregulator OCA-B promotes CD4+ T cell memory recall responses and autoimmunity. OCA-B T cell deletion prevents spontaneous type-1 diabetes (T1D) onset in non-obese diabetic (NOD) mice and blunts T1D in a subset of more aggressive models. However, the role of OCA-B in diabetes induced by treatment with immune checkpoint inhibitors (ICIs), and the role of OCA-B in the control of tumors with and without ICI treatment, has not been studied. Here we show that islet and pancreatic lymph node T cells from T1D individuals express measurable POU2AF1 mRNA. Deletion of OCA-B in T cells fully insulates 8-week-old non-obese diabetic (NOD) mice against ICI-induced diabetes and partially protects 12-week-old mice. Salivary and lacrimal gland infiltration and inflammation were also reduced. Protection was associated with a block in the differentiation of progenitor exhausted CD8+ T cells (TPEX) into terminally exhausted CD8+ T cells (TEX). We show that OCA-B T cell loss preserves anti-tumor immune responses following PD-1 blockade in different tumors and mouse strains. These findings point to a potential therapeutic window in which pharmaceuticals targeting OCA-B could be used to block the emergence of both spontaneous and ICI-induced autoimmunity while sparing anti-tumor immunity. We develop first-in-class small molecule inhibitors of Oct1/OCA-B transcription complexes and show that administration into NOD mice also blocks diabetes emergence following PD-1 blockade. These results identify OCA-B as a promising therapeutic target for the prevention of autoimmunity and immune-related adverse events (irAEs).

16
A Conversational Artificial Intelligence Framework for Comparative Pathway-Level Profiling of Sezary Syndrome and Primary Cutaneous CD8+ Aggressive Epidermotropic Cytotoxic T-Cell Lymphoma (PCAECTCL)

Diaz, F. C.; Waldrup, B.; Carranza, F. G.; Manjarrez, S.; Velazquez-Villarreal, E.

2026-04-17 oncology 10.64898/2026.04.15.26350992 medRxiv
Top 2%
0.7%

Background: Sezary syndrome (SS) is an aggressive leukemic variant of cutaneous T-cell lymphoma (CTCL) with distinct clinical and biological features compared to rarer entities such as primary cutaneous CD8+ aggressive epidermotropic cytotoxic T-cell lymphoma (PCAECTCL). Although recurrent genomic alterations in CTCL have been described, comparative analyses at the pathway level across biologically divergent subtypes remain limited. Here, we leveraged a conversational artificial intelligence (AI) platform for precision oncology to enable rapid, integrative, and hypothesis-driven interrogation of publicly available genomic datasets. Methods: We conducted a secondary analysis of somatic mutation and clinical data from the Columbia University CTCL cohort accessed via cBioPortal. Cases were stratified into SS (n=26) and PCAECTCL (n=13). High-confidence coding variants were curated and mapped to biologically relevant signaling pathways and functional gene categories implicated in CTCL pathogenesis. Pathway-level mutation frequencies were compared using Chi-square or Fisher's exact tests, with effect sizes quantified as odds ratios. Tumor mutational burden (TMB) was compared using the Wilcoxon rank-sum test. Subtype-specific co-mutation patterns were evaluated using pairwise association analyses and visualized through oncoplots and network heatmaps. Conversational AI agents, AI-HOPE, were used to iteratively refine cohort definitions, prioritize pathway-level signals, and contextualize findings. Results: TMB was comparable between SS and PCAECTCL (p = 0.96), indicating no significant difference in global mutational load. In contrast, pathway-centric analyses revealed marked qualitative differences. SS demonstrated enrichment of alterations in epigenetic regulators, tumor suppressor and cell-cycle control pathways, NFAT signaling, and DNA damage response mechanisms, consistent with transcriptional dysregulation and immune modulation. 
PCAECTCL exhibited relatively higher frequencies of alterations involving epigenetic regulators and MAPK pathway signaling, suggesting distinct oncogenic dependencies. Co-mutation analysis revealed a more constrained and focused interaction landscape in SS, whereas PCAECTCL displayed broader and more heterogeneous co-mutation networks, indicative of divergent evolutionary trajectories. Notably, the frequency of ERBB2 mutations differed significantly between subtypes (p = 0.031), highlighting a potential subtype-specific therapeutic vulnerability. Conclusions: This study demonstrates that SS is distinguished from PCAECTCL not by increased mutational burden but by distinct pathway-level architectures, particularly involving epigenetic regulation, immune signaling, and transcriptional control. These findings generate biologically grounded, testable hypotheses for subtype-specific therapeutic targeting and underscore the value of conversational AI as a scalable framework for accelerating discovery in translational cancer genomics.

17
Vector2Variant: Discovery of Genetic Associations from ML Derived Representations without Phenotype Engineering

Sooknah, M.; Srinivasan, R.; Sankarapandian, S.; Chen, Z.; Xu, J.

2026-04-17 genetic and genomic medicine 10.64898/2026.04.10.26350624 medRxiv
Top 2%
0.7%

Genome-wide association studies (GWAS) have transformed our understanding of human biology, but are constrained by the need for predefined phenotypes. We introduce Vector2Variant (V2V), a general-purpose framework that transforms any set of high-dimensional measurements (such as machine learning embeddings) into a genome-wide scan for associations, without requiring rigid specification of a phenotype. Rather than testing genetic variants against single traits, V2V finds the axis in multivariate space along which carriers and non-carriers maximally differ, and produces a continuous "projection phenotype" that can be interpreted by association with disease labels. The projection phenotypes correlate with orthogonal clinical biomarkers never seen during training, suggesting the learned axes capture biologically meaningful variation. We applied V2V to imaging, timeseries, and omics modalities in the UK Biobank and recovered established biology (like the role of CASP9 in renal failure) without the need for targeted measurements, alongside novel associations including a frameshift variant in LRRIQ1 (potentially protective for cardiovascular disease). V2V is computationally efficient at genome-wide scale, producing summary statistics and disease associations that facilitate target prioritization without the need for phenotype engineering.
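
The idea of a "projection phenotype" can be sketched in a few lines. The abstract does not state how V2V finds the axis along which carriers and non-carriers maximally differ, so the sketch below substitutes the simplest possible stand-in, the normalized difference of group means; this is an assumption for illustration, not the authors' method, and all names are hypothetical.

```python
from math import sqrt


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def projection_phenotype(embeddings, is_carrier):
    """Return (axis, scores) for one variant.

    ASSUMPTION: the axis is the unit vector pointing from the non-carrier
    mean to the carrier mean. Each sample's score is its projection onto
    that axis, giving one continuous value per individual.
    """
    carriers = [e for e, c in zip(embeddings, is_carrier) if c]
    others = [e for e, c in zip(embeddings, is_carrier) if not c]
    if not carriers or not others:
        raise ValueError("need at least one carrier and one non-carrier")
    diff = [m1 - m0 for m1, m0 in zip(mean_vector(carriers), mean_vector(others))]
    norm = sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("group means coincide; no separating axis")
    axis = [d / norm for d in diff]
    scores = [sum(a * x for a, x in zip(axis, e)) for e in embeddings]
    return axis, scores


if __name__ == "__main__":
    # Toy 2-D embeddings: carriers are shifted along the first dimension only.
    toy_embeddings = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
    toy_carrier = [False, False, True, True]
    axis, scores = projection_phenotype(toy_embeddings, toy_carrier)
    print("axis:", axis)
    print("scores:", scores)
```

The resulting scores could then be tested for association with disease labels, as the abstract describes; a covariance-aware direction (for example, a linear discriminant) would be a natural refinement of this mean-difference stand-in.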

18
Virtual Spectral Decomposition of Plasma Biomarkers for Non-Invasive Detection of Cerebral Amyloid Pathology: A Multi-Channel Framework with Disease-Exclusion Logic

Chandra, S.

2026-04-15 neurology 10.64898/2026.04.14.26350885 medRxiv
Top 2%
0.5%

Background. Detection of cerebral amyloid pathology currently requires amyloid PET imaging ($5,000-$8,000) or cerebrospinal fluid analysis via lumbar puncture, procedures that are inaccessible for population-level screening. The FDA-cleared Lumipulse G pTau217/Abeta1-42 plasma ratio test (May 2025) represents the first approved blood-based alternative; however, single-ratio approaches cannot distinguish Alzheimer's disease (AD) from non-AD neurodegeneration or provide multi-dimensional disease characterization. Methods. We developed Virtual Spectral Decomposition (VSD), a framework that decomposes plasma biomarker profiles into biologically interpretable diagnostic channels. Four plasma biomarkers - phosphorylated tau-217 (pTau217), amyloid-beta42/40 ratio, neurofilament light chain (NfL), and glial fibrillary acidic protein (GFAP) - were measured in 1,139 Alzheimer's Disease Neuroimaging Initiative (ADNI) participants. Each biomarker was mapped to a VSD channel representing a distinct pathophysiological axis: tau/amyloid phosphorylation, amyloid clearance, neurodegeneration, and astrocytic activation. Channel weights were calibrated via logistic regression, and performance was evaluated against amyloid PET (UC Berkeley) using 10x5-fold repeated cross-validation. Results. VSD 4-channel fusion achieved AUC = 0.900 (+/-0.018), exceeding pTau217 alone (0.888+/-0.022). Optimal sensitivity was 89.7% with 78.1% specificity (NPV = 90.8%). The NfL channel received a negative weight (beta = -1.1), functioning as a disease-exclusion signal: elevated neurodegeneration without amyloid-tau coupling actively reduces the AD probability, distinguishing AD from non-AD neurodegeneration. Complementary CSF proteomics analysis (7,008 proteins, 533 participants) identified 17 amyloid-specific proteins (0.24% of the proteome), revealing a 49:1 tau-to-amyloid asymmetry that explains why blood-based tau markers outperform amyloid markers. Conclusions. 
Blood-based VSD provides an interpretable, multi-channel framework for amyloid detection that incorporates explicit disease-exclusion logic unavailable to single-biomarker approaches. The architecture extends to multi-disease screening, where the same blood specimen could be routed through disease-specific modules for AD, Parkinson's disease, and cancer.

19
Integrated mapping of human meniscus and cartilage eQTLs reveals shared and distinct osteoarthritis genetic drivers

Uchida, Y.; Fujii, Y.; Swahn, H.; Ueda, M. T.; Chiba, T.; Matsushima, T.; Naito, Y.; Nakamichi, R.; Takahashi, K.; Olmer, M.; The RE-JOIN Consortium Investigators; Lotz, M.; Kochi, Y.; Asahara, H.

2026-04-16 orthopedics 10.64898/2026.04.12.26350702 medRxiv
Top 2%
0.5%

Osteoarthritis (OA) is a prevalent musculoskeletal disorder and a leading cause of global disability. Although meniscal damage is a major risk factor for OA, genetic regulatory studies have remained largely confined to articular cartilage. Here, we establish the first comprehensive expression quantitative trait locus (eQTL) map integrating whole-genome sequencing and bulk transcriptomics from human meniscus (n=112) and cartilage (n=113). Supported by single-nucleus multiomics (cartilage: 56,549 nuclei; meniscus: 34,343 nuclei), we uncovered highly tissue-specific genetic risk architectures. Colocalization with OA GWAS identified 27 meniscus-specific, 28 shared, and 20 cartilage-specific causal genes. Chromatin-informed fine-mapping and deconvolution elucidated distinct pathogenic mechanisms; notably, meniscus-specific signals converged on VEGFA via rare promoter variants and an enhancer in fibrochondrocyte progenitors, alongside a shared eQTL for CLEC18A. Exploratory analysis suggested candidate compounds to reverse pathogenic gene expression. Our findings underscore the meniscus as a distinct genetic driver, molecularly reinforcing the view of OA as a failure of the entire joint organ.

20
Positive Selection Screen Identifies Natural Product β-Catenin Inactivators

Boudreau, M. W.; Freire, V. F.; Corbett, S. C.; Martinez-Fructuoso, L.; Shenoy, S. R.; Yu, W.; Kumar, R.; Thornburg, C. C.; Akee, R. K.; Peyser, B. D.; Jiang, Q.; Splaine, J.; Pfaff, J. L.; Chandler, B. C.; Abeja, D. M.; Donovan, K. A.; Che, J.; Lampson, B. L.; Cooke, M.; Kazanietz, M. G.; Szajner, P.; Smith, J. A.; Koduri, V.; Grkovic, T.; O'Keefe, B. R.; Kaelin, W. G.

2026-04-17 cancer biology 10.1101/2025.08.27.671140 medRxiv
Top 3%
0.5%

Many genetically validated targets in cancer, including the transcription factor β-catenin (β-cat), have historically been viewed as undruggable. Cell-based phenotypic screening of chemical compounds can reveal new biological and pharmacological principles. Natural products are powerful probes because of their superior structural diversity, drug-like properties, and biological activities as compared to unoptimized synthetic compounds. We screened 326,304 natural product mixtures (40,744 extracts and 285,560 fractions derived from them) using mammalian cells expressing an oncogenic version of β-cat fused to a suicide protein. Multiple fractions degraded the β-cat fusion protein or drove it into a compartment where both fusion partners were apparently inactive. The active natural product from one of the latter specifically activates novel, but not classical, protein kinase Cs (PKCs) and thereby relocates β-cat to juxtamembrane vacuolar structures. These findings suggest a path for inactivating oncogenic β-cat and underscore the power of screening natural product collections with robust phenotypic assays.